Background

Historically, hundreds of thousands of chum salmon (Oncorhynchus keta) adults returned annually to the Lower Columbia River (LCR). However, much like other salmon populations throughout the Pacific Northwest, the overall abundance of LCR chum has declined precipitously as a result of anthropogenic impacts and climatic changes. In 1999, Columbia River chum salmon were listed as Threatened under the Endangered Species Act (ESA), prompting state and federal agencies to begin recovery efforts. Specifically, the Washington Department of Fish and Wildlife (WDFW) began implementing a chum recovery program that uses a three-pronged approach, which includes habitat restoration, hatchery supplementation, and monitoring and evaluation. Each of these prongs has been implemented for 15–20 years, but the resulting data sets have not previously been integrated to understand the patterns and drivers of temporal and spatial variability in chum abundance and productivity.

In 2016, WDFW proposed the development of a life-cycle model (LCM) to improve our understanding of LCR chum population dynamics and to provide a platform for future analyses that could: (1) help identify limiting life stages through time and space, (2) help identify criteria to assess the need for supplementation, and (3) help prioritize habitat restoration by identifying the features of viable chum habitat patches.

Here we describe a stage-structured, multi-population LCM for LCR chum formulated as an integrated population model (IPM), in which a joint likelihood function allows the model to be fitted to all available data at once (Schaub and Abadi 2011). IPMs are the state-of-the-art framework for parameterizing structured population models and have been used widely in fisheries and wildlife ecology because they offer numerous advantages over an ad hoc, piecemeal approach, especially when data are sparse or heterogeneous (Newman et al. 2006, Su and Peterman 2012, Fleischman et al. 2013, Maunder and Punt 2013). The heart of an IPM is a state-space model, a form of hierarchical model that distinguishes true ecological process stochasticity from observation noise (Valpine and Hilborn 2005).

Our LCR chum IPM is developed as part of the R package salmonIPM (Buhle et al. 2018), which facilitates fitting a variety of age- and stage-structured IPMs tailored to anadromous salmonid life histories. This platform allows a reproducible workflow from data to model-fitting and management scenario analysis, which can be easily updated as new data become available. Models in salmonIPM are implemented in Stan (Stan Development Team 2019), a probabilistic programming language for Bayesian estimation that uses the cutting-edge Hamiltonian Monte Carlo (HMC) algorithm to draw samples from the posterior distribution (Monnahan et al. 2017).

Development of the LCR chum IPM is an iterative process, starting with the generic spawner-to-spawner and spawner-smolt-spawner models available in salmonIPM and incrementally adding customizations to accommodate the complexity of population and data structures and to enable the model to more directly answer questions of management concern. In previous years these have included, e.g., external sample-based estimates of spawner and smolt observation error; using fecundity and sex structure data to estimate egg-to-smolt survival; and representing adult translocations into artificial spawning channels. In 2023 for the first time we are explicitly modeling the hatchery smolt-to-adult recruitment process, replacing the more flexible but non-mechanistic generic model of hatchery-origin spawner abundance. The new approach provides estimates of smolt-to-adult survival (SAR) for hatchery populations as well as the probabilities of hatchery-origin recruits straying to each natural population. It also allows us to evaluate alternative scenarios of future hatchery operations; we present a preliminary set of such simulations below.

This vignette first details the structure of the LCR chum IPM, including the process model (the true underlying population dynamics), the observation model (the likelihood of data arising from field monitoring programs given the unknown true state), and the priors. We then demonstrate how to fit LCR chum models using salmonIPM and discuss and interpret the main results, including reporting metrics of interest to managers. We conclude by outlining our plans for further model development. The goal of the vignette is to provide an introduction to both the mathematical structure of the model and the workflow involved in fitting it in R and working with the output. To that end, we display a minimal amount of R code, but all the code used in the examples shown here can be found in the RMarkdown source used to generate this document.

Model Structure

Process Model

Egg Deposition and Egg-to-Smolt Survival

The chum life cycle in natural populations begins with spawning, egg incubation and fry emergence, and shortly thereafter the downstream migration of juveniles (which we refer to as smolts). In salmonIPM we can fit three alternative spawner-recruit functions to describe the expected relationship between spawner abundance \(S_{jt}\) and smolt abundance \(M_{jt}\) from brood year \(t\) in population \(j\): density-independent discrete exponential, Beverton-Holt, and Ricker:

\[ f {\left( S_{jt} | \alpha_{jt}, M_{\text{max},j}, A_{jt} \right)} = \begin{cases} \alpha_{jt} S_{jt} & \text{exponential} \\ \dfrac{ \alpha_{jt} S_{jt} }{ 1 + \dfrac{ \alpha_{jt} S_{jt} }{ A_{jt} M_{\text{max},j} }} & \text{Beverton-Holt} \\ \alpha_{jt} S_{jt} \text{exp} {\left(- \dfrac{ \alpha_{jt}S_{jt} }{ \text{exp}(1) A_{jt} M_{\text{max},j} } \right)} & \text{Ricker} \end{cases} \]

We use a nonstandard parameterization of the Ricker by maximum smolt production or “capacity” \(M_{\text{max}}\), corresponding to the mode of the function. This facilitates direct comparison with the Beverton-Holt (where \(M_{\text{max}}\) is the asymptote) and is better-identified by data than the standard parameterization based on per capita density dependence. \(M_{\text{max}}\) has units of density (fish per stream distance or area) and is then expanded to units of abundance based on habitat size \(A_{jt}\). In the case of LCR chum, this is km of spawning habitat estimated using GIS.

Spawner-to-smolt intrinsic productivity \(\alpha_{jt}\) is the weighted mean of age-specific female fecundity \(\mu_{Ea}\), weighted by the spawner age distribution \(q_{jta}\), multiplied by the proportion of female spawners \(q_{\text{F}jt}\), discounted for the proportion of females that are not “green” (i.e. fully fecund) with discount rate \(\delta_\text{NG} \in [0,1]\), and finally multiplied by the density-independent maximum egg-to-smolt survival \(\psi_j\):

\[ \alpha_{jt} = \psi_j q_{Fjt} \left[p_{\text{G}jt} + \delta_\text{NG} {\left(1 - p_{\text{G}jt} \right)} \right] \sum_{a=3}^{5} q_{jta} \mu_{Ea}. \]

The discount for partially spawned females is only relevant for one population, Duncan Channel, a constructed spawning channel in which some translocated females are believed to have already deposited eggs elsewhere. We assume the proportion of “green” females \(p_{\text{G}jt}\) is known without error since translocated spawners are individually handled and visually assessed.

This formulation implicitly assumes egg deposition is density-independent while egg-to-smolt survival is density-dependent, resulting in realized survival \(< \psi_j\) as spawner density increases. Maximum egg-to-smolt survival varies randomly among populations according to the hyperdistribution

\[ \text{logit}{\left( \psi_j \right)} \sim \text{N}{\left(\mu_\psi + \mathbf{x}_{\psi jt} \boldsymbol{\beta}_\psi, \sigma_\psi \right)} \]

and maximum smolt production varies randomly among populations according to the hyperdistribution

\[ \text{log}{\left( M_{\text{max},j} \right)} \sim \text{N}{\left( \mu_{M_\text{max}} + \mathbf{x}_{M_\text{max},jt} \boldsymbol{\beta}_{M_\text{max}}, \sigma_{M_\text{max}} \right)}. \]

Covariates can be incorporated into the hierarchical model for each spawner-recruit parameter \(\theta\) via a \(1 \times K_\theta\) vector of predictors \(\mathbf{x}_{\theta jt}\) with corresponding \(K_\theta \times 1\) coefficient vector \(\boldsymbol{\beta}_\theta\). We do not consider this option further here, but we are exploring the use of covariates to represent freshwater habitat conditions.

Smolt Recruitment Process Error

Given expected smolt production, realized smolt production (the unknown “true state”) is lognormally distributed with process errors that include a common ESU-level autoregressive trend \(\eta^\text{year}_{Mt}\) with first-order autocorrelation coefficient \(\rho_{M}\) and innovation SD \(\sigma^\text{year}_{M}\) plus unique independent shocks \(\varepsilon_{Mjt}\):

\[ \begin{aligned} M_{jt} &= f {\left( S_{jt} | \alpha_{jt}, M_{\text{max},j}, A_{jt} \right)} \ \text{exp}{\left(\mathbf{x}_{Mjt} \beta_M + \eta^\text{year}_{Mt} + \varepsilon_{Mjt}\right)} \\ \eta^\text{year}_{Mt} &\sim \text{N}{\left( \rho_{M} \eta^\text{year}_{M,t-1}, \sigma^\text{year}_{M} \right)} \\ \varepsilon_{Mjt} &\sim \text{N}{\left( 0, \sigma_{M} \right)}. \end{aligned} \]

The process model for smolt recruitment can include covariates specified via a row vector of predictors \(\mathbf{x}_{Mjt}\) with corresponding regression coefficient vector \(\boldsymbol{\beta}_M\). In contrast to the spawner-recruit parameters, covariates of the smolt recruitment process error would represent density-independent effects taking place after density dependence in egg-to-smolt survival.

Smolt-to-Adult Survival

Smolt-to-adult survival (SAR, \(s_{MS}\)) is assumed to be density-independent. The SAR process model for outmigrant cohort \(t\) in population \(j\) is logistic normal, including a common ESU-level first-order autoregressive trend \(\eta^\text{year}_{MSt}\) with autocorrelation coefficient \(\rho_{MS}\) and innovation SD \(\sigma^\text{year}_{MS}\) plus unique independent shocks \(\varepsilon_{MSjt}\):

\[ \begin{aligned} \text{logit}{\left( s_{MSjt} \right)} &= \text{logit}{\left( \mu_{MS} + \mathbf{x}_{MSjt} \boldsymbol{\beta}_{MS} + \eta^\text{year}_{MSt} + \varepsilon_{MSjt} \right)} \\ \eta^\text{year}_{MSt} &\sim \text{N}{\left(\rho_{MS} \eta^\text{year}_{MS,t-1}, \sigma^\text{year}_{MS} \right)} \\ \varepsilon_{MSjt} &\sim \text{N}{\left( 0, \sigma_{MS} \right)}. \end{aligned} \]

As with the smolt recruitment process error, the SAR process model can accommodate spatiotemporally varying covariates \(\mathbf{x}_{MSjt}\) with coefficients \(\boldsymbol{\beta}_{MS}\). We use this option below to model the difference between average SAR of natural and hatchery populations.

Conditional Age-at-Return

Adult age structure is modeled by defining a vector of conditional probabilities, \(\mathbf{p}_{jt} = \begin{bmatrix} p_{3jt} \ p_{4jt} \ p_{5jt} \end{bmatrix} ^ \top\), where \(p_{ajt}\) is the probability that an outmigrant in year \(t\) in population \(j\) returns at age \(a\) given that it survives to adulthood. The unconditional age-at-return probability is given by \(s_{MSjt} p_{ajt}\). Both SAR and conditional age-at-return can be written as functions of underlying annual marine survival and maturation probabilities that are nonidentifiable without some ancillary data; this parameterization resolves the nonidentifiability.

The conditional age probabilities follow a multivariate logistic normal process model with hierarchical structure across populations and through time within each population. The additive log ratio vector,

\[ \text{alr}{\left( \mathbf{p}_{jt} \right)} = \begin{bmatrix} \text{log}{\left( \dfrac{p_{3jt}}{p_{5jt}} \right)} \ \ \text{log}{\left( \dfrac{p_{4jt}}{p_{5jt}} \right)} \end{bmatrix} ^ \top \]

has a hierarchical bivariate normal distribution with population-level random effects \(\boldsymbol{\eta}^\text{pop}_{\mathbf{p}j}\) and unique residuals \(\boldsymbol{\varepsilon}_{\mathbf{p}jt}\):

\[ \begin{aligned} \text{alr}{\left( \mathbf{p}_{jt} \right)} &= \text{alr}{\left( \boldsymbol{\mu}_\mathbf{p} \right)} + \boldsymbol{\eta}^\text{pop}_{\mathbf{p}j} + \boldsymbol{\varepsilon}_{\mathbf{p}jt} \\ \boldsymbol{\eta}^\text{pop}_{\mathbf{p}j} &\sim \text{N}{\left( \mathbf{0}, \boldsymbol{\Sigma}^\text{pop}_\mathbf{p} \right)} \\ \boldsymbol{\varepsilon}_{\mathbf{p}jt} &\sim \text{N}{\left( \mathbf{0}, \boldsymbol{\Sigma}_\mathbf{p} \right)}. \end{aligned} \]

Here the 2 \(\times\) 2 covariance matrices \(\boldsymbol{\Sigma}^\text{pop}_\mathbf{p}\) and \(\boldsymbol{\Sigma}_\mathbf{p}\), parameterized by correlations \(\rho^\text{pop}_\mathbf{p}\) and \(\rho_\mathbf{p}\) and SD vectors \(\boldsymbol{\sigma}^\text{pop}_\mathbf{p}\) and \(\boldsymbol{\sigma}_\mathbf{p}\) respectively, allow correlated variation among age classes (on the unconstrained scale, not merely due to the simplex constraint on \(\mathbf{p}\)) across populations and over time within a population, respectively. For example, populations or cohorts may skew overall younger or older than average.

Sex Structure

Adult sex structure is modeled as the conditional probability \(p_{\text{F}jt}\) that an outmigrant from population \(j\) in year \(t\) is female, given that it survives to adulthood. The proportion of females follows a process model that includes normally distributed population-specific random effects \(\eta^\text{pop}_{\text{F}}\) with hyper-SD \(\sigma^\text{pop}_\text{F}\) and unique residuals \(\varepsilon^\text{pop}_{\text{F}}\) with SD \(\sigma_\text{F}\) around the hyper-mean \(\mu_\text{F}\):

\[ \begin{aligned} \text{logit}{\left( p_{\text{F}jt} \right)} &= \text{logit}{\left( \mu_\text{F} \right)} + \eta^\text{pop}_{\text{F}j} + \varepsilon_{\text{F}jt} \\ \eta^\text{pop}_{\text{F}j} &\sim \text{N}{\left( 0, \sigma^\text{pop}_\text{F} \right)} \\ \varepsilon_{\text{F}jt} &\sim \text{N}{\left( 0, \sigma_\text{F} \right)}. \end{aligned} \]

Hatchery Smolt-to-Adult Survival and Dispersal

In the current version of the IPM, the process model of the hatchery chum life cycle begins with smolts released from (hatchery) population \(h\) in year \(t\). The reported release size \(M^\text{obs}_{ht}\) is taken as known without error since it is relatively precisely estimated, especially compared to smolt abundance in natural populations. The released juveniles are then subject to SAR \(s_{MSht}\) and conditional age-at-return probabilities \(\mathbf{p}_{ht}\) as described above. Adult recruits from hatchery \(h\) return to natural population \(j\) with probability \(p_{\text{origin},hj}\). Given the relatively sparse annual recoveries of known-origin spawners, we do not attempt to model interannual variation in the straying probabilities; \(\mathbf{P}_\text{origin}\) is time-invariant. The probabilities can be written compactly as the straying matrix \(\mathbf{P}_\text{origin}\), whose rows corresponding to hatchery sources sum to 1 over destinations in the set of natural or wild populations \(\mathcal{W}\):

\[ \sum_{j \in \mathcal{W}}{p_{\text{origin},hj}} = 1. \] This implies that all hatchery-origin adults return to populations in the modeled set \(\mathcal{W}\). To the extent that fish from hatchery \(h\) return to sites outside this set, the effect will be to reduce the corresponding estimates of \(s_{MSht}\) (of course, the same is true for the wild populations). In that sense \(s_{MS}\) should be regarded as apparent smolt-to-adult survival. The set of modeled hatcheries \(\mathcal{H}\) currently includes Duncan Hatchery, Lewis Hatchery and Grays Hatchery. Although Duncan Channel-origin adults are also identifiable, we do not yet model straying from naturally reproducing populations. This will be added in a future iteration of the IPM (see Next Steps).

Adult Recruitment and Composition

Recruitment of wild-origin adult spawners \(S_\text{W}\) is calculated by summing over the corresponding outmigrant cohorts and subtracting the number of spawners removed for hatchery broodstock or to spawning channels (\(B_{jt}\), assumed known without error). Fishery mortality of LCR chum is believed to be negligible.

\[ S_{\text{W}jt} = \left(\sum_{a=3}^{5} M_{j,t-a} \ s_{MSj,t-a} \ p_{aj,t-a} \right) - B_{jt} = \left(\sum_{a=3}^{5} \tilde{S}_{\text{W}ajt} \right) - B_{jt}. \]

Natural spawner age structure is given by \(\mathbf{q}_{jt} = \begin{bmatrix} q_{3jt} \ q_{4jt} \ q_{5jt} \end{bmatrix} ^ \top\), where \(q_{ajt} = \tilde{S}_{\text{W}ajt} / \sum_a{\tilde{S}_{\text{W}ajt}}\). Natural spawner sex structure is given by the age-weighted average of female proportions from the respective outmigrant cohorts: \(q_{\text{F}jt} = \sum_{a} {q_{ajt} \hspace{0.1cm} p_{\text{F}j,t-a}}\). The abundance of hatchery-origin spawners \(S_\text{H}\) is the sum over \(h \in \mathcal{H}\) of the number of adult recruits from hatchery \(h\) that returned to population \(j\):

\[ S_{\text{H}jt} = \sum_{h \in \mathcal{H}} {\sum_{a=3}^{5} {M^\text{obs}_{h,t-a} \ s_{MSh,t-a} \ p_{ah,t-a} \ p_{\text{origin},hj}}} = \sum_{h \in \mathcal{H}}{\tilde{S}_{\text{H}hjt}}. \]

Spawner origin-composition is given by \(\mathbf{q}_{\text{origin},jt} = \begin{bmatrix} q_{\text{W}jt} \ q_{\text{H}1jt} \ q_{\text{H}2jt} \ q_{\text{H}3jt} \end{bmatrix} ^ \top\), where \(q_{\text{W}jt} = S_{\text{W}jt} / (S_{\text{W}jt} + S_{\text{H}jt})\) and \(q_{\text{H}hjt} = \tilde{S}_{\text{H}hjt} / (S_{\text{W}jt} + S_{\text{H}jt})\). The total proportion of hatchery-origin spawners is \(\sum_{h \in \mathcal{H}}{q_{\text{H}hjt}}\). Total spawner abundance is then

\[ S_{jt} = S_{\text{W}jt} + S_{\text{H}jt} + S^\text{add}_{jt} \]

where \(S^\text{add}_{jt}\) is the number of spawners that returned to populations \(i \neq j\) and were translocated into population \(j\). This is only nonzero in the case of Duncan Channel, and is assumed to be known without error.

Observation Model

Fecundity

We modeled observations of fecundity from individual female chum salmon collected at hatcheries. The observation likelihood for the fecundity of female \(i\) of age \(a\) is a zero-truncated normal with age-specific mean and SD:

\[ E_{ai}^\text{obs} \sim \text{N}_+ {\left( \mu_{Ea}, \sigma_{Ea} \right)}. \]

Smolt and Spawner Abundance

Empirical estimates of smolt and spawner abundance come from monitoring programs that use various sampling methods (e.g., rotary screw traps for outmigrants and weirs or mark-recapture for adults) and statistical models (e.g., the BTSPAS R package or other Bayesian time-series models) to produce a posterior distribution of the abundance of a given life-history stage in each population and year. These methods have previously been applied and their output summarized by the posterior mean and SD. We used the posterior moments of abundance to compute the log-mean \(\mu_{Ljt}\) and log-SD \(\tau_{Ljt}\) for each life stage \(L\) assuming the posterior is lognormal, which appears to be reasonable in general. We then used the observation-model posteriors as informative priors on the corresponding state variables in the IPM.

One complication is that outmigrant monitoring sites do not always have a one-to-one correspondence with independent populations. Specifically, the smolt trap in the mainstem Grays River intercepts juveniles produced in the mainstem population itself as well as the upstream West Fork Grays River and Crazy Johnson Creek populations, the latter of which is also sampled with its own trap. To make the states comparable to the observations, we defined a derived state \(\tilde{M}_{jt}\) that is the sum of smolts produced in population \(j\) and all upstream populations, assuming similar mortality en route to the trap.

\[ \begin{aligned} \text{log}{\left( \tilde{M}_{jt} \right)} &\sim \text{N}{\left( \mu_{\tilde{M}jt}, \tau_{\tilde{M}jt} \right)} \\ \text{log}{\left(S_{jt} \right)} &\sim \text{N}{\left(\mu_{Sjt}, \tau_{Sjt} \right)}. \end{aligned} \]

The IPM framework naturally handles missing observations, but in some cases an estimate of the observation prior log-mean \(\mu_{Ljt}\) was available while the observation error log-SD \(\tau_{Ljt}\) was missing or unknown. These missing values were imputed within the IPM by fitting a lognormal hyperdistribution to the known log-SDs:

\[ \begin{aligned} \text{log}{\left(\tau_{Mij}\right)} &\sim \text{N}{\left( \mu_{\tau_M}, \sigma_{\tau_M} \right)} \\ \text{log}{\left( \tau_{Sij} \right)} &\sim \text{N}{\left( \mu_{\tau_S}, \sigma_{\tau_S} \right)}. \end{aligned} \]

Spawner Age Composition

Spawner age-composition data are collected from scale samples in each population and year (the IPM framework naturally handles cases in which no age samples are available). Age frequencies of wild spawners \(\mathbf{n}_{ajt}^\text{obs} = \begin{bmatrix} n_{3jt}^\text{obs} \ n_{4jt}^\text{obs} \ n_{5jt}^\text{obs} \end{bmatrix} ^\top\) are assumed to follow a multinomial observation likelihood with expected probabilities given by the true state:

\[ \mathbf{n}_{ajt}^\text{obs} \sim \text{Multinomial}{\left( \sum_{a=3}^5 n_{ajt}^\text{obs}, \mathbf{q}_{jt} \right)}. \]

Spawner Sex Ratio

The observed frequency of female spawners is modeled with a binomial observation likelihood given the true proportion and the total sample size that were sexed:

\[ n_{\text{F}jt}^\text{obs} \sim \text{Bin}{\left( n_{\text{M}jt}^\text{obs} + n_{\text{F}jt}^\text{obs}, q_{\text{F},jt} \right)}. \]

Spawner Origin Composition

Frequencies of hatchery-origin spawners from each program are estimated based on otolith marking or genetic parentage-based tagging (PBT). Together with the frequency of unknown natural-origin spawners, they are assumed to follow a multinomial observation likelihood with expected probabilities given by the true origin distribution:

\[ \mathbf{n}_{\text{origin},jt}^\text{obs} \sim \text{Multinomial}{\left( \sum \mathbf{n}_{\text{origin},jt}^\text{obs}, \mathbf{q}_{\text{origin},jt} \right)}. \]

Priors

To complete the model specification, we need priors on all top-level or hyperparameters as well as initial states that are not generated by the process model. In the list of parameter priors below, the power-exponential (also known as generalized normal or Subbotin) density

\[ \text{PowerExp}(x | u,s,r) = \dfrac{r}{2 s \Gamma(1/r)} \text{exp}{\left[ - \left( \dfrac{|x - u|}{s} \right)^r \right]} \]

is used to regularize autocorrelation coefficients away from the computationally problematic boundaries [-1,1]. Autocorrelations are declared to have support on the stationary region, so Stan truncates the distribution. The prior mean and SD of log-mean smolt capacity, \(m_{M_\text{max}}\) and \(s_{M_\text{max}}\), are the sample maximum and SD of observed log smolt density, respectively. While this data-aware prior technically uses the data twice, it is broad and weakly informative and merely serves to find a reasonable scale for overall population size, similar to widely used Bayesian inference packages.

Hyperparameter Definition (units) Prior
\(\boldsymbol{\mu}_{E}\) mean fecundity by female age \(\text{N}_+(2500,500)\)
\(\boldsymbol{\sigma}_{E}\) SD of fecundity by female age \(\text{N}_+(500,1000)\)
\(\delta_{\text{NG}}\) non-green female fecundity discount \(\text{U}(0,1)\)
\(\mu_\psi\) mean egg-to-fry survival \(\text{U}(0,1)\)
\(\sigma_\psi\) logit-SD of egg-to-fry survival \(\text{N}_+(0,1)\)
\(\mu_{M_\text{max}}\) log-mean maximum smolt production (m-1) \(\text{N}(m_{M_\text{max}}, s_{M_\text{max}})\)
\(\sigma_{M_\text{max}}\) log-SD of maximum smolt production (m-1) \(\text{N}_+(0,3)\)
\(\rho_M\) autocorrelation of shared smolt recruitment process error \(\text{PowerExp}(0,0.85,20)\)
\(\sigma^\text{year}_M\) log-SD of shared smolt recruitment process error \(\text{N}_+(0,2)\)
\(\sigma_M\) log-SD of unique smolt recruitment process error \(\text{N}_+(0,2)\)
\(\mu_{MS}\) mean SAR \(\text{U}(0,1)\)
\(\beta_{MS}\) effect of hatchery rearing on logit SAR \(\text{N}(0,3)\)
\(\rho_{MS}\) autocorrelation of shared SAR process error \(\text{PowerExp}(0,0.85,20)\)
\(\sigma^\text{year}_{MS}\) logit-SD of shared SAR process error \(\text{N}_+(0,2)\)
\(\sigma_{MS}\) logit-SD of unique SAR process error \(\text{N}_+(0,2)\)
\(\boldsymbol{\mu}_\mathbf{p}\) mean conditional probabilities of age-at-return \(\text{Dirichlet}(\mathbf{1})\)
\(\boldsymbol{\sigma}^\text{pop}_\mathbf{p}\) among-pop SDs of log ratio age-at-return probabilities \(\text{N}_+(0,2)\)
\(\boldsymbol{\sigma}_\mathbf{p}\) within-pop SDs of log ratio age-at-return probabilities \(\text{N}_+(0,2)\)
\(\rho_\mathbf{p}^\text{pop}\) among-pop correlation of log ratio age-at-return probabilities \(\text{U}(-1,1)\)
\(\rho_\mathbf{p}\) within-pop correlation of log ratio age-at-return probabilities \(\text{U}(-1,1)\)
\(\mu_\text{F}\) mean proportion female \(\text{U}(0,1)\)
\(\sigma^\text{pop}_\text{F}\) among-pop logit-SD of proportion female \(\text{N}_+(0,2)\)
\(\sigma_\text{F}\) within-pop logit-SD of proportion female \(\text{N}_+(0,2)\)
\(\mathbf{P}_\text{origin}\) hatchery-to-natural straying matrix (prior is applied by row) \(\text{Dirichlet}(\mathbf{1})\)
\(\mu_{\tau_M}\) log-mean of smolt observation error log-SD \(\text{N}(0,1)\)
\(\sigma_{\tau_M}\) log-SD of smolt observation error log-SD \(\text{N}_+(0,5)\)
\(\mu_{\tau_S}\) log-mean of spawner observation error log-SD \(\text{N}(0,1)\)
\(\sigma_{\tau_S}\) log-SD of spawner observation error log-SD \(\text{N}_+(0,5)\)

Finally, lognormal priors on initial smolt abundance (in year 1 of each population time series) and spawner abundance (in years 1-3) are also data-aware but weakly informative, with parameters equal to the sample log-mean and 2 times the sample log-SD of all observed abundances, respectively. The prior on spawner age composition in years 1-3 is simplex-uniform over the space of “orphan” age classes, i.e. those too old to have parent brood years included in the process model. The proportion of females in the same orphan age classes is given a \(\text{Beta}(3,3)\) prior to weakly regularize toward an equal sex ratio.

Data Structure

User inputs needed to fit the LCR chum IPM include, first, a data frame containing observations and associated information organized by population and year. We can look at the subset of rows of fish_data for the Crazy Johnson Creek population (coded Grays CJ) to see what column names and formats salmonIPM expects:

pop pop_type year A S_obs tau_S_obs M_obs tau_M_obs downstream_trap n_age3_obs n_age4_obs n_age5_obs n_W_obs n_H_obs n_origin0_obs n_origin6_obs n_origin11_obs n_origin14_obs n_M_obs n_F_obs p_G_obs B_take_obs S_add_obs F_rate
Grays CJ natural 2004 644 1051 0.16 NA NA 249 4 26 1 31 0 31 0 0 0 22 9 1 0 0 0
Grays CJ natural 2005 644 1418 0.03 NA NA 250 56 101 40 197 12 197 0 0 12 85 124 1 0 0 0
Grays CJ natural 2006 644 3819 0.05 NA NA 251 65 250 9 324 13 324 0 0 13 176 161 1 0 0 0
Grays CJ natural 2007 644 870 0.13 NA NA 252 12 77 14 103 4 103 0 0 4 55 52 1 0 0 0
Grays CJ natural 2008 644 1093 0.13 NA NA 253 45 78 4 127 13 127 0 0 13 57 83 1 0 0 0
Grays CJ natural 2009 644 996 0.03 NA NA 254 74 26 2 102 3 102 0 0 3 47 58 1 0 0 0
Grays CJ natural 2010 644 865 0.13 NA NA 255 11 65 0 76 2 76 0 0 2 39 39 1 0 0 0
Grays CJ natural 2011 644 2304 0.12 NA NA 256 11 184 4 199 16 199 0 0 16 110 105 1 0 0 0
Grays CJ natural 2012 644 3475 0.09 576450 0.24 257 21 184 34 239 8 239 0 0 8 177 70 1 0 0 0
Grays CJ natural 2013 644 1925 0.06 1466141 0.04 258 112 106 25 244 16 244 0 0 16 148 115 1 0 0 0
Grays CJ natural 2014 644 1541 0.04 1101634 0.10 259 37 165 11 214 23 214 0 0 23 151 88 1 0 0 0
Grays CJ natural 2015 644 4193 0.08 419369 0.14 260 114 170 14 298 26 298 0 0 26 165 159 1 0 0 0
Grays CJ natural 2016 644 5987 0.06 1155179 0.06 261 31 236 26 294 9 294 0 0 9 165 139 1 0 0 0
Grays CJ natural 2017 965 3681 0.15 544785 0.07 262 96 195 79 371 25 371 0 0 25 206 190 1 0 0 0
Grays CJ natural 2018 965 899 0.02 1259033 0.07 263 122 46 4 172 6 172 0 0 6 102 78 1 0 0 0
Grays CJ natural 2019 965 72 1.17 276500 0.36 264 3 11 0 14 0 14 0 0 0 10 4 1 0 0 0
Grays CJ natural 2020 965 2863 0.10 8189 0.10 265 157 55 3 215 5 215 0 0 5 128 92 1 0 0 0
Grays CJ natural 2021 965 6279 0.10 1236534 0.06 266 190 141 2 333 12 333 0 0 12 171 175 1 0 0 0
Grays CJ natural 2022 965 NA NA 753589 0.13 267 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0


Column names generally follow the mathematical notation; pop_type is a dummy indicator to distinguish natural and hatchery populations, and downstream_trap gives the row index corresponding to the downstream smolt trap that captures outmigrants from the given pop and year. In this case, Grays CJ smolts are also counted at a local trap. The origin-frequency columns are indexed by the numeric values of the factor levels corresponding to hatchery populations; for example, levels(fish_data$pop)[14] is Grays Hatchery, while n_origin0_obs corresponds to natural spawners of unknown origin.

Second, fecundity_data contains observations of individual fecundity indexed by hatchery female. Here are the first few rows (as described above, year and ID are not used in modeling):

   year   ID age_E E_obs
1  2009  F-1     3  2322
2  2009  F-6     3  2481
3  2009 F-13     3  2930
4  2009 F-19     3  2817
5  2009 F-20     3  2905
6  2009 F-22     3  1803
7  2009 F-29     3  2284
8  2009 F-32     3  2586
9  2009 F-35     3  1855
10 2009 F-42     3  2533

Retrospective Models

Model Fitting

We can now call the function salmonIPM() to fit models to the historical LCR chum monitoring data. In principle we could fit all three spawner-recruit functions, or models with alternative covariate specifications, and compare them based on estimated out-of-sample predictive performance using the approximate leave-one-out cross-validation score (LOO; Vehtari et al. 2017). In practice we find that LOO estimates for IPMs fitted to the LCR chum data are too unstable to be usable (excessive Pareto k-values indicating that some observations are highly influential on the posterior), requiring brute-force cross-validation or customized methods. These are under development, but in the meantime we proceed to use the Ricker model for inference since it is biologically plausible and fits the data reasonably well. We allow a fixed difference between the average SAR of natural and hatchery populations using the par_models formula interface. The center = FALSE and scale = FALSE arguments prevent the default behavior of centering and standardizing covariates since pop_type is a factor. As with all multi-stage salmonIPM models, we need to specify the age in years of fixed juvenile life stages, in this case smolts (M). The help page ?salmonIPM provides much more detail on available arguments.

fit_Ricker <- salmonIPM(stan_model = "IPM_LCRchum_pp", SR_fun = "Ricker", 
                        par_models = list(s_MS ~ pop_type), 
                        center = FALSE, scale = FALSE, ages = list(M = 1), 
                        fish_data = fish_data, fecundity_data = fecundity_data,
                        log_lik = TRUE, chains = 4, iter = 2000, warmup = 1000,
                        control = list(adapt_delta = 0.95, max_treedepth = 15))
print(fit_Ricker, prob = c(0.05, 0.5, 0.95),
      pars = c("psi","Mmax","eta_year_M","eta_year_MS","eta_pop_p","mu_pop_alr_p","p",
               "p_F","p_origin","tau_M","tau_S","B_rate","E_hat","M","S","s_MS",
               "q","q_F","q_origin","p_HOS","LL","M_downstream"), 
      include = FALSE, use_cache = FALSE)
Inference for Stan model: IPM_LCRchum_pp.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

                    mean se_mean    sd        5%       50%       95% n_eff Rhat
mu_E[1]          2595.68    0.52 44.81   2521.22   2595.39   2668.37  7448 1.00
mu_E[2]          2856.67    0.28 24.03   2817.01   2856.64   2895.54  7334 1.00
mu_E[3]          2869.60    0.89 73.09   2750.92   2870.02   2992.54  6693 1.00
sigma_E[1]        510.72    0.41 32.93    460.60    508.71    566.36  6551 1.00
sigma_E[2]        560.39    0.19 16.80    533.47    559.75    588.35  7732 1.00
sigma_E[3]        436.58    0.65 54.19    356.34    431.59    531.73  6933 1.00
delta_NG            0.67    0.00  0.19      0.34      0.69      0.96  3684 1.00
mu_psi              0.47    0.00  0.07      0.36      0.47      0.60  1203 1.00
sigma_psi           0.44    0.01  0.24      0.10      0.41      0.87  1117 1.01
mu_Mmax             7.94    0.05  1.42      6.46      7.62     10.69   847 1.00
sigma_Mmax          1.98    0.02  0.96      0.97      1.73      3.92  1765 1.00
rho_M              -0.12    0.01  0.33     -0.63     -0.13      0.45  1418 1.01
sigma_year_M        0.38    0.00  0.11      0.21      0.37      0.58  1833 1.00
sigma_M             0.43    0.00  0.05      0.36      0.43      0.51  1792 1.00
mu_MS               0.00    0.00  0.00      0.00      0.00      0.00  1016 1.01
beta_MS[1]         -1.12    0.00  0.19     -1.44     -1.12     -0.82  2170 1.00
rho_MS              0.42    0.01  0.25     -0.06      0.45      0.78  1148 1.01
sigma_year_MS       1.07    0.01  0.24      0.76      1.03      1.51   290 1.01
sigma_MS            0.81    0.00  0.06      0.71      0.81      0.91   817 1.01
mu_p[1]             0.26    0.00  0.02      0.23      0.26      0.29  1335 1.00
mu_p[2]             0.70    0.00  0.02      0.67      0.70      0.73  1590 1.00
mu_p[3]             0.04    0.00  0.00      0.03      0.04      0.05   822 1.00
sigma_pop_p[1]      0.18    0.01  0.15      0.01      0.14      0.47   670 1.01
sigma_pop_p[2]      0.11    0.00  0.10      0.01      0.08      0.29   746 1.01
R_pop_p[1,1]        1.00     NaN  0.00      1.00      1.00      1.00   NaN  NaN
R_pop_p[1,2]        0.29    0.01  0.57     -0.81      0.42      0.97  1688 1.00
R_pop_p[2,1]        0.29    0.01  0.57     -0.81      0.42      0.97  1688 1.00
R_pop_p[2,2]        1.00    0.00  0.00      1.00      1.00      1.00  4097 1.00
sigma_p[1]          1.70    0.00  0.13      1.50      1.70      1.92  1116 1.00
sigma_p[2]          0.88    0.00  0.09      0.74      0.87      1.03   926 1.00
R_p[1,1]            1.00     NaN  0.00      1.00      1.00      1.00   NaN  NaN
R_p[1,2]            0.79    0.00  0.05      0.69      0.79      0.86  1597 1.00
R_p[2,1]            0.79    0.00  0.05      0.69      0.79      0.86  1597 1.00
R_p[2,2]            1.00    0.00  0.00      1.00      1.00      1.00  3883 1.00
mu_F                0.51    0.00  0.02      0.47      0.51      0.54  2413 1.00
sigma_pop_F         0.23    0.00  0.08      0.13      0.22      0.36  1657 1.00
sigma_F             0.40    0.00  0.04      0.34      0.40      0.46  1789 1.00
mu_tau_M            0.08    0.00  0.01      0.06      0.08      0.10  5936 1.00
sigma_tau_M         1.10    0.00  0.11      0.94      1.09      1.28  6654 1.00
mu_tau_S            0.11    0.00  0.01      0.10      0.11      0.12  6879 1.00
sigma_tau_S         0.96    0.00  0.05      0.88      0.96      1.05  4532 1.00
lp__           -46899.42    1.29 42.78 -46972.03 -46898.25 -46828.39  1106 1.00

Samples were drawn using NUTS(diag_e) at Sat Jul  1 19:51:38 2023.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

This model takes approximately 6 hr to fit on a 3-GHz multicore Intel processor. The results (showing only hyperparameters, except p_origin, for readability) suggest that the Markov chains mixed adequately, since the Gelman-Rubin \(\widehat{R}\) diagnostics are all ≤ 1.02 and effective sample sizes \(n_\text{eff}\) are sufficient for computing 90% credible intervals. We may get false-positive warnings that some Rhat and n_eff estimates are NaN, but these are for the fixed diagonal elements of the 2 \(\times\) 2 correlation matrices R_pop_p and R_p parameterized by \(\rho_{\mathbf{p}}^\text{pop}\) and \(\rho_{\mathbf{p}}\), respectively. We could further explore the chains and HMC diagnostics using interactive visualizations in the shinystan package.

Freshwater Productivity

[OLD multiplot TEXT NEEDS TO BE MERGED]

Fecundity

Fecundity was higher for 4- and 5-year old females than for the youngest age class. Here the histograms show observations and the lines and shading show the posterior median and 90% credible interval, respectively, of the estimated zero-truncated normal density, which fits the data reasonably well.

Egg-to-Smolt Survival and Capacity

Spawner-recruit parameters at population and ESU level, showing populations that have direct or indirect smolt abundance observations…

Spawner-Recruit Functions

We can look more closely at freshwater productivity by plotting the spawner-recruit functions for each population along with the observations and states. Here the lines and dark inner shading indicate the posterior median and 90% credible interval of the fitted Ricker curve, and the light outer shading represents total (shared plus unique) process error. (Due to the high reported precision of the observations, observation error makes only a minor contribution to the posterior predictive distribution and is not shown.) The black points are the spawner and smolt data for the five populations with smolt traps. Note that the trap counts for Grays MS represent the sum of production in the mainstem and in the upstream Grays CJ and Grays WF populations, as discussed above, and thus the observations in that case are not directly comparable to the states and fitted values. Open gray points are state estimates, with error bars (in some cases truncated for clarity) indicating 90% credible intervals.

[S-R PLOT UNDER CONSTRUCTION]

The data are sparse, but overall the Ricker appears to be an adequate fit. The states with noticeably higher uncertainty generally correspond to cases with missing spawner observations and/or missing observation error estimates, as we will see in the time series plots below.

Smolt Abundance

Fits to the smolt abundance data, following the same conventions as for spawner abundance, immediately show how much sparser the juvenile sampling is. Only six populations are monitored with smolt traps (including Grays WF, which along with Grays CJ is incorporated into the summed smolt observations and estimates for the downstream Grays MS trap) and several of those time series are incomplete. For that reason, a greater share of the information about the underlying state comes from the process model, which induces synchronous fluctuations.

Marine Survival

The posterior of the regression coefficient beta_MS[1] is strongly negative. Since the levels of pop_type are ordered c("natural", "hatchery"), this indicates that hatchery SAR was lower on average for the 3 hatcheries than for the 12 wild populations. The respective ESU-level hyper-means of smolt-to-adult survival, summarized by posterior median and 90% credible interval, are 0.043% [0.028%, 0.066%] for the wild populations and (0.131% [0.098%, 0.179%]) for the hatcheries.

To visualize this difference, we can compare the time series of ESU-level (posterior medians and 90% credible intervals) and population-level (posterior medians only) SAR for natural and hatchery populations. The two groups are highly correlated, because variation in SAR is dominated by the common trend eta_year_MS. This also allows information to be shared between the groups via the hyperparameters of the SAR process model, so the inclusion of the hatcheries in the model improves the estimates for wild populations. As mentioned previously, the lower apparent survival of hatchery smolts could reflect a greater tendency to stray outside the modeled set \(\mathcal{W}\) in addition to biological differences due to hatchery rearing (Naish et al. 2008).

Hatchery Adult Dispersal

Plotting the posterior distribution of the hatchery straying matrix (posterior medians, 50% and 90% credible intervals) reveals broad overlap between Duncan and Lewis Hatcheries in their spatial distribution of adult recruits, while Grays Hatchery fish returned exclusively to the Grays subbasins. The estimates for Grays Hatchery are much tighter due to the larger number of recoveries for the origin-composition likelihood, especially compared to the sparse observations of Lewis Hatchery fish.

Spawner Abundance and Composition

The next plot shows observed and estimated total spawner abundance for each natural population. Points are “data”, i.e. posterior medians from the sample-based observation models, with associated 90% credible intervals (sometimes too narrow to be visible). Filled points indicate observations whose error SD is known, while the SD for open points is imputed. The solid gray line is the posterior median of the estimated true state from the IPM. Posterior 90% credible intervals indicate process (dark shading) and observation (light shading) uncertainty, i.e. the posterior predictive distribution (PPD) of hypothetical replicate data under the model. Leading or trailing years without spawner abundance estimates are included if smolt abundance is available.

Observation Error

To understand how the IPM imputes the abundance observation error SD in cases where it is not reported, we can plot the lognormal hyperdistribution (posterior median and 90% credible interval) fitted to the known SD values (histogram).

Spawner Age Composition

Next we compare the estimated true natural spawner age-frequencies (posterior medians and 90% credible intervals of the states and of the binomial PPD for hypothetical replicate data) to the observed sample proportions (with 90% binomial confidence intervals). Age composition varies considerably across populations and through time, tracking fluctuations in cohort strength. The precision of the estimated states and the PPD follows the sample size, with smaller samples giving relatively more weight to the hierarchical process model.

Spawner Sex Ratio

Plots of the estimated natural adult sex ratio, following a similar format, likewise show strong interannual fluctuations and some systematic deviations from a 1:1 sex ratio across populations.

Proportion of Hatchery-Origin Spawners

The plot below shows the estimated true states and PPD of \(p_\text{HOS}\), summed across hatchery programs, with corresponding sample proportions and binomial 90% confidence intervals. The proportion of hatchery-origin spawners was generally low, with some very small origin-frequency sample sizes that cannot definitively rule out higher values. There is some noticeable lack of fit, especially in Crazy Johnson Creek, due to the relatively parsimonious model of hatchery adult dispersal (e.g., time-invariant straying probabilities), but overall the PPD based on the hatchery smolt-to-adult process model closely matches the observed data.

Hatchery Scenario Analysis

Modeling the smolt-to-adult transition of the hatchery life cycle enables us to construct prospective scenarios representing alternative futures for supplementation as a recovery strategy. Here we present a comparison between two simple scenarios: (H0) hatchery operations cease, so future broodstock removal and smolt releases are zero; or (Hmax) hatcheries are operated consistently at the highest broodstock removal rates from each population and the highest releases from each program that have been observed in the available data.

We compared supplementation policies over a 20-year (approximately 5-generation) time horizon. Because the hatchery control rules are fixed rather than state-dependent, it is straightforward to use the IPM to simulate future population dynamics. We can simply pad fish_data with rows corresponding to future years in each population and re-fit the model. Estimated states will then include the future years, just as they would any past years with missing data. This seamless integration of retrospective fitting and forward projection is a major advantage of the IPM approach; there is no need to separately specify initial conditions or devise ad hoc methods for simulating parameter uncertainty or process noise.

In this case we have two padded data frames, fish_data_foreH0 and fish_data_foreHmax, corresponding to the two scenarios. Inspecting the final year for each natural or hatchery population in the latter data frame shows the control variables used in the Hmax scenario. Note that we use the fishery mortality F_rate to represent future broodstock removal rates.

                pop      recovery_pop F_rate
1  Hamilton Channel       Lower Gorge  0.067
2    Hamilton Creek       Lower Gorge  0.000
3              Ives       Lower Gorge  0.205
4       Hardy Creek       Lower Gorge  0.000
5    Duncan Channel       Lower Gorge  0.000
6         Horsetail       Lower Gorge  0.161
7          St Cloud       Lower Gorge  0.186
8         Multnomah       Lower Gorge  0.215
9             I-205 I-205 / Washougal  0.070
10         Grays CJ   Grays / Chinook  0.000
11         Grays MS   Grays / Chinook  0.156
12         Grays WF   Grays / Chinook  0.000
              pop  M_obs
1 Duncan Hatchery 217436
2  Lewis Hatchery 110502
3  Grays Hatchery 398000

Estimated states generated by the scenario analysis are summarized at the modeled demographic population level and are also aggregated up to larger management units that are used to define recovery targets and quasi-extinction thresholds (Lower Columbia Fish Recovery Board 2010 Table 6-6). Only the results at this latter “recovery population” level are presented below.

       recovery_pop target QET
1       Lower Gorge   2000  50
2 I-205 / Washougal   1300  50
3   Grays / Chinook   1600  50

We fit the IPM to each augmented data set. Posterior summaries are not shown, but can be printed as demonstrated above for the fit_Ricker model to verify that the sampler apparently converged and that the hyperparameter estimates are the same (up to Monte Carlo error) as in the retrospective case.

foreH0_Ricker <- salmonIPM(stan_model = "IPM_LCRchum_pp", SR_fun = "Ricker", 
                           par_models = list(s_MS ~ pop_type), 
                           center = FALSE, scale = FALSE, ages = list(M = 1), 
                           fish_data = fish_data_foreH0, 
                           fecundity_data = fecundity_data,
                           chains = 4, iter = 2000, warmup = 1000,
                           control = list(adapt_delta = 0.95, max_treedepth = 15))
foreHmax_Ricker <- salmonIPM(stan_model = "IPM_LCRchum_pp", SR_fun = "Ricker", 
                             par_models = list(s_MS ~ pop_type), 
                             center = FALSE, scale = FALSE, ages = list(M = 1), 
                             fish_data = fish_data_foreHmax, 
                             fecundity_data = fecundity_data,
                             chains = 4, iter = 2000, warmup = 1000,
                             control = list(adapt_delta = 0.95, max_treedepth = 15))

In addition to summarizing quantities of management interest across the full posterior, we present estimates conditional on future environmental conditions. Specifically, we use time-averaged SAR anomalies at the ESU level as a proxy for large-scale (likely oceanographic) factors that drive the majority of process error in population dynamics. The figure below illustrates this procedure. For each trajectory sampled from the posterior (a subsample of 100 draws is shown) we compute the future mean of the shared SAR anomalies \(\eta^\text{year}_{MSt}\). The posterior distribution of the means \(\overline{\eta^\text{year}_{MS}}\) is shown on the right margin of the top panel, divided into tertiles representing relatively low, medium or high SAR anomalies over the forecast period; trajectories in each tertile are likewise color-coded. The bottom panel shows trajectories of spawner abundance \(S_{jt}\) for one population, Hardy Creek (the observations are as shown above for the retrospective fit), with the forecast period color-coded according to the corresponding tertile of mean SAR anomaly. The posterior distributions of geometric mean abundance for trajectories in each level of SAR are on the right margin. Other quantities, such as quasi-extinction risk, can be similarly conditioned on categories of average future SAR.

Spawner Abundance

Geometric mean spawner abundance (points are posterior medians; lines are 50% and 90% credible intervals) responds as expected to variation in the average future SAR anomaly, but there is little if any difference between the H0 and Hmax scenarios. This suggests that losses due to broodstock collection are balanced by the addition of hatchery-origin spawners There are, however, some differences in the tails; in I-205 / Washougal, for example, the lower tails are thinner under Hmax. The reasons for and implications of these subtle hatchery effects will become apparent when we consider the probability of quasi-extinction.

Proportion of Hatchery-Origin Spawners

Under the Hmax scenario, time-averaged future \(p_\text{HOS}\) is highest under low-SAR conditions, which elevate the probability of a decline in natural spawners while hatchery smolt releases are assumed to remain constant. Average future \(p_\text{HOS}\) in the I-205 population (the only one in the I-205 / Washougal recovery population) is higher than has been observed historically; compare the plots of the data and retrospectively estimated states. This reflects the relatively large influx of Lewis Hatchery-origin adults in tandem with a potential decline in natural spawners. Predicted future \(p_\text{HOS}\) in the other two recovery populations is broadly consistent with past observations.

Quasi-Extinction Risk

We computed quasi-extinction risk as the probability that the 4-year (approximately one generation) running geometric mean spawner abundance falls below the quasi-extinction threshold (QET = 50) at least once during the forecast period (McElhany et al. 2000, Lower Columbia Fish Recovery Board 2010 Table E12-2). The probability of quasi-extinction (PQE) is shown below, with 90% credible intervals based on the binomial sample of TRUE / FALSE outcomes. Simulated hatchery supplementation generally reduces PQE, with the possible exception of the Lower Gorge recovery population under low-SAR conditions. The absolute reduction is most dramatic in the I-205 population, especially when average SAR is low, although the relative difference (≈ 50%) is similar in the Grays / Chinook group.

The reduction in PQE under supplementation is due to the buffering effect of relatively stable numbers of hatchery-origin spawners \(S_\text{H}\). An important caveat to these results, and indeed all projections from this scenario analysis, is that we assumed hatchery smolt releases remain constant despite fluctuations in the number of broodstock taken from wild populations under the constant-rate policy. In reality, broodstock collection and thus juvenile releases will decline if \(S_\text{W}\) falls to critically low levels, so the buffering effect will be reduced when it is most needed (e.g., under persistently low-SAR conditions). Thus the results presented here likely represent a best-case scenario for the ability of hatcheries to mitigate quasi-extinction risk. In order to construct a more realistic simulation of hatchery operations, we will first need to complete the process model of the hatchery life cycle so that the numbers of broodstock taken are mechanistically linked to the numbers of smolts produced (see Next Steps).

Probability of Recovery

We also computed the probability of recovery, defined as the 4-year running geometric mean spawner abundance exceeding the recovery population-specific target at least once during the forecast period. Recovery probabilities were nearly indistinguishable from 1 (minimum point estimate, 0.97) for all three recovery populations under both hatchery scenarios and all SAR conditions, so results are not shown here.

One-Step Ahead Forecast

We can easily generate short-term run forecasts using the same approach of augmenting fish_data with future rows. In fact, for forecast horizons up to 3 years we can simply use the existing foreH0_Ricker model because all returning adults belong to cohorts that have already been born, so are unaffected by assumptions about future hatchery operations. Within a single chum generation, the forecast is informed by the observed abundance of parent spawners and outmigrants as well as younger recruit age classes whose older siblings have yet to return. Thus the IPM automatically incorporates the information used in “sibling regression” models (Peterman 1982), but without the restrictive assumption of constant within-cohort age structure.

This table shows pre-season forecasts of total spawners for 2023 along with current habitat size and estimated freshwater productivity parameters for each population (posteriors are summarized by median and 90% credible intervals).

Reporting Metrics

As shown in the figures above, the LCR chum IPM currently generates estimates (or forecasts) of a number of management-relevant quantities and reporting metrics:

  • Total spawner abundance by return year
  • Spawner rearing-type composition (wild / hatchery by program)
  • Spawner age composition
  • Spawner sex composition
  • Mining rate (fraction of natural spawners removed for broodstock)
  • Juvenile abundance at monitored and unmonitored sites
  • Natural and hatchery SAR
  • Natural recruits per spawner

We anticipate that future model development will add to this list, particularly for metrics related to hatchery program effectiveness (e.g., proportion of natural-origin broodstock, in-hatchery egg-to-smolt survival, and adult recruits per broodstock).

Next Steps

The LCR chum IPM remains under active development, with an ultimate goal of providing decision support for adaptive management of habitat restoration, hatchery supplementation, and monitoring design. Toward those ends, our highest priorities for further model development in the near and medium term include the following:

  1. Complete the hatchery life cycle. Building on our model of the smolt-to-adult stage of hatchery populations and dispersal of hatchery-origin recruits, we will complete the life cycle by explicitly accounting for natural- and hatchery-origin broodstock transfers into hatcheries and their egg-to-smolt survival. This will permit a direct comparison between adult-to-adult productivity of naturally reproducing and hatchery-spawned chum, provide model-based estimates of the proportion of natural broodstock (\(p_\text{NOB}\)) and proportion natural influence (\(p_\text{NI}\)), and enable scenario analyses involving modifications to specific hatchery programs.

  2. Model spawning channel operations. The hatchery smolt-to-adult and broodstock-to-smolt life cycle components currently under development only apply to facilities where fish are spawned artificially and a known number of smolts are released. By contrast, some spawning channels (specifically Duncan Channel) resemble hatcheries in that adults from other populations are deliberately translocated into such sites, but resemble natural populations in that fish spawn naturally and the abundance of juvenile outmigrants must be estimated by trapping. This hybrid situation presents additional code implementation challenges, but a full adult-to-adult operational model of spawning channels is necessary in order to evaluate their effectiveness as a restoration strategy.

  3. Expand the data set. In addition to the 12 populations currently included in the IPM, adult abundance and composition data (age, sex and origin) are available for additional, less intensively monitored Washington populations of LCR chum. Incorporating these populations into the model would more fully represent the ESU and improve the estimates of parameters governing shared processes such as smolt production, smolt-to-adult survival, and straying of hatchery-origin fish. The inclusion of these data would also enable scenario analyses that evaluate recovery targets relevant to delisting criteria for the entire ESU, including all strata that historically existed.

  4. Include habitat and environmental drivers. Incorporate covariates (e.g., freshwater habitat quality, oceanographic factors) into the IPM to estimate effects on stage-specific productivity, capacity, or survival and identify potential limiting factors. For example, initial exploratory analysis suggests winter flow may drive variation in egg-to-fry survival. Such relationships could be used to examine population viability under future climate change scenarios. Likewise, predictors of marine survival could reduce the unexplained process noise associated with SAR and improve the precision and accuracy of short-term forecasts.

  5. Scenario analyses to evaluate restoration strategies. The features implemented in the LCR chum IPM to date or proposed for FFY2023-24 will enable us to simulate a range of realistic management interventions, in particular population enhancement (hatchery supplementation and adult translocation) and habitat restoration. Some population enhancement simulations are possible already, as described above — namely, varying the broodstock removal rate and number of smolts released. However, since the operational management action is the transfer of broodstock into hatcheries, completing the adult-to-smolt stage of the hatchery life cycle will provide a more relevant “dial” in the model. Habitat-related simulations could take two forms. First, we could explore future ranges of empirically fitted environmental drivers, e.g. winter flows under climate change. Second, we could simulate freshwater habitat restoration or enhancement strategies such as the construction of artificial channels, in a simplistic way, by increasing egg-to-smolt survival or capacity in specific populations. In any of these scenario analyses, outcomes could be evaluated based on metrics such as abundance, productivity, \(p\text{HOS}\), \(p_\text{NOB}\), PQE, and recovery probability. Furthermore, we could examine the time needed to reach management reference points. For instance, following habitat restoration, does hatchery supplementation or adult translocation significantly reduce the time it takes a population to achieve an abundance target with a given probability, relative to natural recovery?

References

Buhle, E. R., M. D. Scheuerell, T. D. Cooney, M. J. Ford, R. W. Zabel, and J. T. Thorson. 2018. Using integrated population models to evaluate fishery and environmental impacts on Pacific salmon viability. U.S. Department of Commerce, NOAA Technical Memorandum NMFS-NWFSC-140. DOI: 10.7289/V5/TM-NWFSC-140.
Fleischman, S. J., M. J. Catalano, R. A. Clark, and D. R. Bernard. 2013. An age-structured state-space stock-recruit model for Pacific salmon (oncorhynchus spp.). Canadian Journal of Fisheries and Aquatic Sciences 70:401–414.
Lower Columbia Fish Recovery Board. 2010. Washington Lower Columbia Salmon Recovery and Fish & Wildlife Subbasin Plan LCFRB.
Maunder, M. N., and A. E. Punt. 2013. A review of integrated analysis in fisheries stock assessment. Fisheries Research 142:61–74.
McElhany, P., M. H. Ruckelshaus, M. J. Ford, T. C. Wainwright, and E. P. Bjorkstedt. 2000. Viable salmonid populations and the recovery of evolutionarily significant units. Page 156. U.S. Dept. Commer., NOAA Tech. Memo. NMFS-NWFSC-42.
Monnahan, C. C., J. T. Thorson, and T. A. Branch. 2017. Faster estimation of Bayesian models in ecology using Hamiltonian Monte Carlo. Methods in Ecology and Evolution 8:339–348.
Naish, K. A., J. E. Taylor, P. S. Levin, T. P. Quinn, J. R. Winton, D. Huppert, and R. Hilborn. 2008. An evaluation of the effects of conservation and fishery enhancement hatcheries on wild populations of salmon. Advances in Marine Biology 53:61–194.
Newman, K. B., S. T. Buckland, S. T. Lindley, L. Thomas, and C. Fernandez. 2006. Hidden process models for animal population dynamics. Ecological Applications 16:74–86.
Peterman, R. 1982. Model of salmon age structure and its use in preseason forecasting and studies of marine survival. Canadian Journal of Fisheries and Aquatic Sciences 39:1444–1452.
Schaub, M., and F. Abadi. 2011. Integrated population models: A novel analysis framework for deeper insights into population dynamics. Journal of Ornithology 152:227–237.
Stan Development Team. 2019. Stan Modeling Language Users Guide and Reference Manual 2.27.
Su, Z., and R. M. Peterman. 2012. Performance of a Bayesian state-space model of semelparous species for stock-recruitment data subject to measurement error. Ecological Modelling 224:76–89.
Valpine, P. de, and R. Hilborn. 2005. State-space likelihoods for nonlinear fisheries time-series. Canadian Journal of Fisheries and Aquatic Sciences 62:1937–1952.